Abstract: Motivated by the study of deletion channels, this
work presents improved bounds on the number of subsequences
obtained from a binary string $X$ of length $n$ under $t$ deletions. It is
known that the number of subsequences in this setting strongly
depends on the number of runs in the string $X$, where a run
is a maximal sequence of the same character. Our improved
bounds are obtained by a structural analysis of the family of $r$-run
strings $X$, an analysis in which we identify the extremal strings
with respect to the number of subsequences. Specifically, for
every $r$, we present $r$-run strings with the minimum (respectively
maximum) number of subsequences under any $t$ deletions, and
perform an exact analysis of the number of subsequences of these
extremal strings.

I. Introduction 

Let $X \in \{0,1\}^n$ be a binary string of length $n$, and let
$t \le n$ be a parameter. In this work, we study the size of the
set $D_t(X)$ of subsequences of $X$ that can be obtained from
$X$ via $t$ deletions. The set $D_t(X)$ and its size play a major
role in the design and analysis of communication schemes
over deletion channels, i.e., channels in which characters of
the transmitted codeword may be deleted [3]-[7], [9].

The analysis of $D_t(X)$ is challenging, as the number of
subsequences of a string $X$ obtained by deletions does not
depend only on its length $n$ and the number $t$ of deletions, but
also strongly depends on its structure. For example, $D_t(0^n)$ is
of size 1 and equals the single string $0^{n-t}$, while there exist
strings $X$ for which $D_t(X)$ is of size $\exp(\Omega(n - t))$. Clearly,
$|D_t(X)|$ is at most $2^{n-t}$ (as after $t$ deletions we remain with
a binary string of length $n - t$).
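To make the definition concrete, the following minimal Python sketch enumerates $D_t(X)$ by brute force. The name `deletion_set` is ours (it does not appear in the paper), and the approach is only practical for short strings.

```python
from itertools import combinations

def deletion_set(x, t):
    """Return D_t(x): the distinct strings left after deleting exactly t symbols of x."""
    n = len(x)
    if t < 0 or t > n:
        return set()
    # Choosing which n - t symbols to keep (in order) is the same as choosing t deletions.
    return {"".join(kept) for kept in combinations(x, n - t)}

n, t = 8, 3
constant = "0" * n                 # the single-run string 0^n
alternating = "01" * (n // 2)      # a string with n runs
print(len(deletion_set(constant, t)))     # only 0^(n-t) survives
print(len(deletion_set(alternating, t)))  # larger, but never above 2^(n-t)
```

The two extremes illustrate why the length $n$ and the number of deletions $t$ alone do not determine $|D_t(X)|$.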

In his work from 1966, Levenshtein [4] shows (as described
in [5]) that the number of subsequences $|D_t(X)|$ strongly
depends on the number of runs in the string. Here, a run is a
maximal sequence of the same character, and the number of
runs in a given string is denoted $r(\cdot)$. For example $r(0^n) = 1$
while $r(0101\ldots01) = n$. Specifically, Levenshtein [4] proves
that

Bounding $|D_t(X)|$ is addressed by Calabi and Hartnett [1],
who show that the maximal number of subsequences is
obtained from certain strings $X$, denoted cyclic strings $C_n$,
in which $r(X) = |X|$. The authors of [1] devise a recursive
expression for $|D_t(C_n)|$, to obtain the bound

Figure 1. Previous bounds on $|D_t(X)|$. [L] marks the bounds proven by
Levenshtein [4], and [HR] marks the bounds by Hirschberg et al. [2]. Also
plotted is the naive bound $2^{n-t}$, which is the possible number of binary strings
of length $n - t$. This graph shows an example for the case $n = 120$ and
$r = 24$. All graphs are shown on a logarithmic scale.



Relatively recently, Hirschberg and Regnier [2] revisited the
analysis of [1] and obtained an explicit upper bound together
with an improved lower bound of the form
$\le |D_t(X)| \le |D_t(C_n)|$.



Mercier et al. [8] study the setting of small values of $t$, and
present explicit formulas for $D_t(X)$ for $t \le 5$. However, for
general values of $t$ the problem remains open. Several of the
results above generalize also to arbitrary alphabets.

The bounds of [1], [2], [4] are depicted in Figure 1 for the
case $n = 120$ and $r = r(X) = 24$ as a function of $t$. The
lower bounds of both [2] and [4] depend on the number of
runs $r(X)$, and it holds that the lower bound of [2] is superior
(i.e., larger) to that of [4]. The upper bound of [4] depends
on $r(X)$, while that of [1], [2] does not. Thus each bound is
stronger (i.e., smaller) for certain settings of the parameters $r$ and
$t$. Roughly speaking, the upper bounds of [1], [2] are stronger
than those of [4] for large values of $r$ and $t$, while the opposite
is true for small $r$ and $t$.

A. Our results and proof techniques 

In this work, we continue the study of $D_t(X)$ and present
improved upper and lower bounds to those described above.
Our analysis is twofold. We start by studying the family of
strings $X$ with a given number of runs $r = r(X)$, and identify
the extremal strings in this family with respect to the number
of subsequences. Specifically, for every $r$, we identify two
$r$-run strings, referred to as the balanced $r$-run string $B_r$ and
the unbalanced $r$-run string $U_r$, such that for every $X$ it holds that

$|D_t(U_{r(X)})| \le |D_t(X)| \le |D_t(B_{r(X)})|$. (1)

Figure 2. Comparison of lower bounds. Our lower bound based on
unbalanced strings (Theorem VI.2), compared to the previously known bounds.
[L] marks the lower bound proven by Levenshtein [4]. [HR] marks the lower
bound proven by Hirschberg et al. [2]. This graph shows an example for the
case $n = 300$ and $r = 200$. The logarithmic presentation emphasizes that we
obtain an exponential multiplicative improvement.

Loosely speaking, the string $U_r = 0101\ldots01^{n-r+1}$ is the
$r$-run string in which each run is exactly of size 1, except the
last run which is of size $n - r + 1$, and is thus referred to as
'unbalanced' (in the run lengths). The balanced string $B_r =
0^{n/r}1^{n/r}0^{n/r}1^{n/r}\ldots1^{n/r}0^{n/r}$ is the $r$-run string in which each
and every run is of equal length $n/r$.

To obtain Equation (1), we show that any $r$-run string
$X$ can be transformed into the string $U_r$ (alternatively $B_r$)
via a series of operations that are monotonic with respect
to the number of subsequences. The modifications we study
include a balancing operation, in which given $X$ we shorten
the length of one of its runs while increasing the length of
another; a flipping operation, in which a prefix or suffix of $X$ is
replaced by its complement; and an insertion operation, in which
characters are added to $X$ (see Figures 4(a), 4(b) and 4(c)). A
delicate combination of these (and other) operations enables us
to establish Equation (1). The modifications we study and their
analysis shed light on the properties of binary strings under
the deletion operation and may be of independent interest. We
note that for the extreme case of $r = n$, our unbalanced string
$U_n$ is exactly the cyclic string $C_n$; thus we are consistent with
the result of [1].

We then turn to obtain analytic expressions for $|D_t(U_r)|$
and $|D_t(B_r)|$ of Equation (1). Our expressions are at least as
good as previous bounds in [1], [2], [4] as they are based on
specific $r$-run strings ($U_r$ and $B_r$), and for a large range of
parameters our bounds are strictly tighter. For our improved
lower bound, we devise a recursive expression for $|D_t(U_r)|$
and present a closed form formula for its evaluation. We then
perform an asymptotic evaluation of $|D_t(U_r)|$ (assuming large
$r$). A comparison of our improved lower bound with that
previously known is depicted in Figure 2. Specifically, we
show that for values of $t$ which are greater than $r/3$ our lower

Figure 3. Comparison of upper bounds. Our upper bounds based on balanced
strings (Corollary IV.1), compared to the previous best known bounds. [L]
marks the upper bound proven by Levenshtein [4]. [HR] marks the upper bound
proven by Hirschberg et al. [2]. This graph shows an example for the case
$n = 120$ and $r = 24$ as a function of $t$ (in logarithmic scale).

bound improves on those previously known by an exponential 
multiplicative factor of roughly 2*^''^^. 

To address our improved upper bounds, we first present
a recursive formula for the computation of $|D_t(B_r)|$. We
then extract a closed form solution to our recursive definition,
which yields an exact expression for $|D_t(B_r)|$. For example,
a numerical comparison of $|D_t(B_r)|$ with the upper bounds
previously known is depicted in Figure 3 for the values
$n = 120$ and $r = 24$ as a function of $t$. We note that the
expression we obtain for $|D_t(B_r)|$ involves several summations
of certain combinatorial expressions. An asymptotic analysis
of our expression is left open in this work and is subject to
future research.

B. Structure 

The remainder of the paper is organized as follows. In
Section II we present the set of structural operations and
tools we use for comparing and bounding the number of
subsequences obtained via deletion. This section includes our
balancing, flipping, and insertion modifications. In Section III
we study our first family of balanced strings, and show that
(for any given number of runs $r$ and deletions $t$) they have the
largest number of subsequences under deletion. In Section IV
we analyze the number of subsequences of balanced strings
and in doing so obtain our upper bound. In Section V we present
our second family of unbalanced strings, and prove that they
have the least number of subsequences under any number
of deletions $t$. We prove our lower bound by analyzing the
number of subsequences of unbalanced strings in Section VI.
Finally, in Section VII we study the connection between
subsequences and the closely related notion of deletion patterns.
Using this connection, we show exponential multiplicative
gaps between our improved upper bound and those previously
presented.

II. Tools for analyzing the number of subsequences

The number of subsequences of a string obtained by deletions
highly depends on the string's structure. In order to
determine the number of subsequences for a given number
of deletions, it is not enough to know the length of the
string, and not even the number of the string's runs. Inspired
by previous works, we looked for tools that will enable us
to analyze the number of subsequences. In this section we
present these tools. In Subsection II-A we present a method
of counting the number of subsequences by partitioning the
set of subsequences into subsets characterized by their prefix,
thus forming a recursive relation. In Subsection II-B we
present basic operations on strings that always increase (or
decrease) the number of subsequences under deletion. Such
basic operations allow comparison between the number of
subsequences of strings, and are very useful for finding bounds
on the number of subsequences.

$S(x_1, \ldots, x_r)$ denotes a binary string with $r$ runs, in which
the $i$-th run is of length $x_i$ and the first symbol is 0. E.g.
$S(1,2,3) = 011000$. We will use the notation $n \times a$ to
indicate $n$ sequential runs of length $a$. E.g. $S(2, 3 \times 1, 2) =
S(2,1,1,1,2) = 0010100$. $D_t(x_1, \ldots, x_r)$ will be used as
short form for $D_t(S(x_1, \ldots, x_r))$. $C_n$ denotes the binary
cyclic string $S(n \times 1)$. We assume the following conventions:
$\sum_{l=i}^{j} a_l = 0$ when $i > j$; $\binom{n}{i} = 0$ when $i < 0$ or $i > n$;
$|D_t(X)| = 1$ for $t = |X|$ and $t = 0$; and $|D_t(X)| = 0$ for
$t > |X|$.
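The run-length notation can be mirrored directly in code. The sketch below is illustrative only: the helper names `S`, `times` and `num_runs` are our own, with `times(n, a)` standing in for the shorthand $n \times a$.

```python
def S(*runs):
    """Binary string whose i-th run has length runs[i]; the first symbol is '0'."""
    return "".join(("0" if i % 2 == 0 else "1") * length
                   for i, length in enumerate(runs))

def times(n, a):
    """The 'n x a' shorthand: n consecutive runs, each of length a."""
    return (a,) * n

def num_runs(x):
    """r(x): the number of maximal blocks of equal symbols in x."""
    return sum(1 for i, c in enumerate(x) if i == 0 or c != x[i - 1])

print(S(1, 2, 3))                # the example S(1,2,3)
print(S(2, *times(3, 1), 2))     # the example S(2, 3 x 1, 2)
print(S(*times(6, 1)))           # the cyclic string C_6
```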



A. Partitioning the set of subsequences 

We found the following lemma (from [2]) very useful. We
restate it here and derive a corollary for binary strings.

Lemma II.1. [2] For any $\Sigma$-string $X$:

(i) $D_t(X) = \sum_{a \in \Sigma} D_t^{a}(X)$, where for a set $G$ of strings
$G^{a}$ denotes all members of $G$ starting with $a$.

(ii) $D_t^{a}(X) = a\,D_{t+1-f(a)}(X[f(a)+1 : n])$, where $f(a)$
denotes the index of the first appearance of $a$ in $X$, and
$X[i : j]$ denotes the substring $x_i \ldots x_j$ of $X$.
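As a sanity check of Lemma II.1, the sketch below counts $|D_t(X)|$ by splitting on the first symbol of the subsequence, using $D_t^{a}(X) = a\,D_{t+1-f(a)}(X[f(a)+1 : n])$, and compares the result with brute-force enumeration. The function names and the memoisation via `lru_cache` are our own choices, not part of [2].

```python
from functools import lru_cache
from itertools import combinations, product

def brute_count(x, t):
    """|D_t(x)| by explicit enumeration (0 when t is out of range)."""
    n = len(x)
    if t < 0 or t > n:
        return 0
    return len({"".join(kept) for kept in combinations(x, n - t)})

@lru_cache(maxsize=None)
def count(x, t):
    """|D_t(x)| via the first-symbol partition of Lemma II.1."""
    n = len(x)
    if t < 0 or t > n:
        return 0
    if t == n:
        return 1                      # only the empty string remains
    total = 0
    for a in "01":
        f = x.find(a) + 1             # 1-based index f(a); 0 if a does not occur
        if f == 0:
            continue
        # Subsequences starting with a: a followed by D_{t+1-f(a)}(X[f(a)+1 : n]).
        total += count(x[f:], t + 1 - f)
    return total

def check(max_len):
    """Compare the recursion with brute force on all binary strings up to max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            if any(count(x, t) != brute_count(x, t) for t in range(-1, n + 2)):
                return False
    return True

print(check(7))
```

The recursion touches each (suffix, $t$) pair once, so it scales far better than enumeration, although it is still only meant as an illustration.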

We derive the following lemma for binary strings. 

Lemma II.2.

(i) For any binary string $X$, s.t. $X = \sigma^i \epsilon^j Y$ for some
$i, j > 0$ and $Y \in \{\sigma, \epsilon\}^*$, $|D_t(X)| = |D_t(\sigma^{i-1}\epsilon^j Y)| +
|D_{t-i}(\epsilon^{j-1} Y)|$ for any $t < |X|$.

(ii) Symmetrically, $|D_t(Y\epsilon^j\sigma^i)| = |D_t(Y\epsilon^j\sigma^{i-1})| +
|D_{t-i}(Y\epsilon^{j-1})|$.

Proof: (i) Following the notation of Lemma II.1,
$f(\sigma) = 1$ and $f(\epsilon) = i + 1$. Using Lemma II.1(ii),
$D_t^{\sigma}(X) = \sigma D_{t+1-1}(X[2 : n])$ and $D_t^{\epsilon}(X) = \epsilon D_{t+1-(i+1)}(X[i + 2 : n])$.
Applying Lemma II.1(i) we get the result.

(ii) The proof for the symmetric case is identical. ■

Applying Lemma II.2 repeatedly, we get the following
lemma.

Lemma II.3. For any binary string $S(x_1, \ldots, x_r)$, s.t. $n =
\sum_{i=1}^{r} x_i$:

(i) $|D_t(x_1, \ldots, x_r)| = |D_t(x_2, \ldots, x_r)| + \sum_{i=1}^{x_1} |D_{t-i}(x_2 -
1, x_3, \ldots, x_r)| + 1_{t > n - x_1}$.

(ii) Symmetrically, $|D_t(x_1, \ldots, x_r)| = |D_t(x_1, \ldots, x_{r-1})| +
\sum_{i=1}^{x_r} |D_{t-i}(x_1, \ldots, x_{r-2}, x_{r-1} - 1)| + 1_{t > n - x_r}$.

Figure 4. Basic operations on strings. In all diagrams the lower string has
more subsequences under any number of deletions.



Proof: (i) We denote $n = \sum_{i=1}^{r} x_i$. Using
Lemma II.2 once, we get $|D_t(x_1, \ldots, x_r)| =
|D_t(x_1 - 1, x_2, \ldots, x_r)| + |D_{t-x_1}(x_2 - 1, x_3, \ldots, x_r)|$.
For $x_1 > 1$ we can use Lemma II.2 again and get
$|D_t(x_1, \ldots, x_r)| = |D_t(x_1 - 2, x_2, \ldots, x_r)| + |D_{t-x_1+1}(x_2 -
1, x_3, \ldots, x_r)| + |D_{t-x_1}(x_2 - 1, x_3, \ldots, x_r)|$. Likewise,
for $j \le \min(x_1, n - t)$, applying Lemma II.2 $j$ times
yields $|D_t(x_1, \ldots, x_r)| = |D_t(x_1 - j, x_2, \ldots, x_r)| +
\sum_{i=x_1-j+1}^{x_1} |D_{t-i}(x_2 - 1, x_3, \ldots, x_r)|$. When $t \le n - x_1$, it
follows that $\min(x_1, n - t) = x_1$, and so we can expand
using Lemma II.2 exactly $x_1$ times to get $|D_t(x_1, \ldots, x_r)| =
|D_t(x_2, \ldots, x_r)| + \sum_{i=1}^{x_1} |D_{t-i}(x_2 - 1, x_3, \ldots, x_r)|$.

When $t > n - x_1$, after expanding $n - t$ times, we
get the expression $|D_t(x_1, \ldots, x_r)| = |D_t(x_1 - (n -
t), x_2, \ldots, x_r)| + \sum_{i=x_1-(n-t)+1}^{x_1} |D_{t-i}(x_2 - 1, x_3, \ldots, x_r)|$.
As $|S(x_1 - (n - t), x_2, \ldots, x_r)| = t$, and noticing that for
$t > |X|$, $|D_t(X)| = 0$, we get the lemma's claim.

(ii) The proof for the symmetric case is identical. ■
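Lemma II.3(i) can be checked numerically. The sketch below evaluates the right-hand side $|D_t(x_2,\ldots,x_r)| + \sum_{i=1}^{x_1}|D_{t-i}(x_2-1,x_3,\ldots,x_r)| + 1_{t>n-x_1}$ on the string itself (dropping the first run, and then one more symbol) and compares it with a brute-force count. The helper names are ours, and the strict inequality in the indicator is our reading of the statement, consistent with the convention $|D_t(X)| = 1$ for $t = |X|$.

```python
from itertools import combinations, product

def count(x, t):
    """|D_t(x)| by brute force; 0 when t < 0 or t > |x|."""
    n = len(x)
    if t < 0 or t > n:
        return 0
    return len({"".join(kept) for kept in combinations(x, n - t)})

def lemma_ii3_rhs(x, t):
    """Right-hand side of Lemma II.3(i) for a string x with at least two runs."""
    n = len(x)
    x1 = n - len(x.lstrip(x[0]))      # length of the first run
    rest = x[x1:]                     # S(x_2, ..., x_r), up to complement
    rest_minus = x[x1 + 1:]           # S(x_2 - 1, x_3, ..., x_r), up to complement
    indicator = 1 if t > n - x1 else 0
    return (count(rest, t)
            + sum(count(rest_minus, t - i) for i in range(1, x1 + 1))
            + indicator)

def check(max_len):
    """Verify the identity for every string with two or more runs, and every t."""
    for n in range(2, max_len + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            if len(set(x)) < 2:
                continue              # a single run: the lemma does not apply
            if any(count(x, t) != lemma_ii3_rhs(x, t) for t in range(n + 1)):
                return False
    return True

print(check(7))
```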

B. Basic operations on strings 

In the following sections we will present families of strings
for which the number of subsequences can be explicitly
calculated. In order to use these families of strings to devise
bounds on the number of subsequences for general strings, we
use basic operations on strings, which allow us to transform
one string into another while monotonically increasing (or
decreasing) the number of their subsequences. In this section
we list such basic operations.

1) Insertion operation [Figure 4(a)]: Hirschberg et al. [2]
showed that inserting a symbol anywhere in the middle of a
string can only increase the number of subsequences.

Lemma II.4. [Insertion increases the number of subsequences]
[2] For any $\Sigma$-strings $U, V$ and any $a \in \Sigma$, $|D_t(UV)| \le
|D_t(UaV)|$.
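A small exhaustive experiment illustrates the insertion lemma on binary strings. This is only a sketch with hypothetical helper names: it tries every insertion position and every inserted bit for all short strings and reports whether $|D_t(UV)| \le |D_t(UaV)|$ held in each case.

```python
from itertools import combinations, product

def count(x, t):
    """|D_t(x)| by brute force; 0 when t < 0 or t > |x|."""
    n = len(x)
    if t < 0 or t > n:
        return 0
    return len({"".join(kept) for kept in combinations(x, n - t)})

def check_insertion(max_len):
    """Test |D_t(UV)| <= |D_t(UaV)| for all binary U, V with |UV| <= max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            for cut in range(n + 1):          # split x into U = x[:cut], V = x[cut:]
                for a in "01":
                    y = x[:cut] + a + x[cut:]
                    if any(count(x, t) > count(y, t) for t in range(n + 1)):
                        return False
    return True

print(check_insertion(6))
```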

2) Deletion chain rule:

Lemma II.5. For any $\Sigma$-string $U$, and any $V \in D_t(U)$,
$D_{t'}(V) \subseteq D_{t+t'}(U)$.

Proof: $V$ was obtained from $U$ by deleting $t$ symbols.
Any string in $D_{t'}(V)$ is obtained by deleting $t'$ symbols from
$V$. The same string can be created by removing the $t + t'$
symbols directly from $U$, and thus it belongs also to $D_{t+t'}(U)$.
■

3) Flipping operation [Figure 4(b)]:

Lemma II.6. [Flipping increases number of subsequences] For
any binary strings $U, V$ and for any bit $\sigma$, $|D_t(U\sigma\sigma V)| \le
|D_t(U\sigma\bar{\sigma}\bar{V})|$, where $\bar{a}$ denotes the string $a$ in which 0's are
flipped to 1's, and vice versa.

Proof: By induction on $|U|$. Throughout, $\epsilon = \bar{\sigma}$. When $|U| = 0$ the claim
is $|D_t(\sigma\sigma V)| \le |D_t(\sigma\epsilon\bar{V})|$. Let $V = \sigma^i\epsilon^j X$ for maximal $i, j$.
When $j = 0$ the claim is trivial ($\sigma\sigma V$ is a constant string, with
1 possible subsequence), so we assume $j > 0$. Using Lemma
II.2 we get $|D_t(\sigma\sigma V)| = |D_t(\sigma V)| + |D_{t-2-i}(\epsilon^{j-1}X)|$.
We compare that to $|D_t(\sigma\epsilon\bar{V})| = |D_t(\epsilon\bar{V})| + |D_{t-1}(\bar{V})|$.
$|D_t(\sigma V)| = |D_t(\epsilon\bar{V})|$ because of symmetry, and since
$\epsilon^{j-1}X \in D_{i+1}(V)$ we can use Lemma II.5 and get
$|D_{t-2-i}(\epsilon^{j-1}X)| \le |D_{t-1}(V)| = |D_{t-1}(\bar{V})|$, and thus we prove the base
of the induction.

Now for the induction step, assume the claim is true for $|U| <
n$ and we look at $|U| = n$. We regard the different cases of
the structure of $U$.

Case 1: $U = \sigma^i\epsilon^j X$ for some $i, j > 0$. We
use Lemma II.2 and get $|D_t(\sigma^i\epsilon^j X\sigma\sigma V)| =
|D_t(\sigma^{i-1}\epsilon^j X\sigma\sigma V)| + |D_{t-i}(\epsilon^{j-1}X\sigma\sigma V)|$. We compare
that to $|D_t(\sigma^i\epsilon^j X\sigma\epsilon\bar{V})| = |D_t(\sigma^{i-1}\epsilon^j X\sigma\epsilon\bar{V})| +
|D_{t-i}(\epsilon^{j-1}X\sigma\epsilon\bar{V})|$. On each of the arguments we can
use our induction claim for $|U| - 1$ and $|U| - i - 1$.

Case 2: $U = \epsilon^i\sigma^j X$ for some $i, j > 0$. We use the same
method.

Case 3: $U = \epsilon^i$ for some $i > 0$. $|D_t(\epsilon^i\sigma\sigma V)| =
|D_t(\epsilon^{i-1}\sigma\sigma V)| + |D_{t-i}(\sigma V)|$. For the flipped string
we get $|D_t(\epsilon^i\sigma\epsilon\bar{V})| = |D_t(\epsilon^{i-1}\sigma\epsilon\bar{V})| + |D_{t-i}(\epsilon\bar{V})|$.
In this case, the second argument in both summations
is equal due to symmetry, and we can compare the first
arguments using the induction hypothesis for $|U| - 1$.

Case 4: $U = \sigma^i$ for some $i > 0$. Let $V = \sigma^j X$
for maximal $j$. In case $|X| = 0$ we get the trivial
case of a uniform string again. For $|X| > 0$ let
$X = \epsilon Y$, and then $|D_t(U\sigma\sigma V)| = |D_t(\sigma^{i+2}V)| =
|D_t(\sigma^{i+1}V)| + |D_{t-i-j-2}(Y)|$. Again we compare that
to $|D_t(\sigma^{i+1}\epsilon\bar{V})| = |D_t(\sigma^i\epsilon\bar{V})| + |D_{t-i-1}(\bar{V})|$, using
the induction claim for the first argument, and Lemma
II.5 together with symmetry for the second.

■
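The flipping lemma lends itself to a direct experiment. The following sketch (helper names are ours) forms $U\sigma\sigma V$ and $U\sigma\bar{\sigma}\bar{V}$ for all short binary $U$, $V$ and both values of $\sigma$, and compares the brute-force counts for every $t$.

```python
from itertools import combinations, product

def count(x, t):
    """|D_t(x)| by brute force; 0 when t < 0 or t > |x|."""
    n = len(x)
    if t < 0 or t > n:
        return 0
    return len({"".join(kept) for kept in combinations(x, n - t)})

def complement(s):
    """Flip every 0 to 1 and every 1 to 0."""
    return s.translate(str.maketrans("01", "10"))

def check_flipping(max_u, max_v):
    """Test |D_t(U s s V)| <= |D_t(U s comp(s V))| for all short binary U, V."""
    for nu in range(max_u + 1):
        for nv in range(max_v + 1):
            for u_bits in product("01", repeat=nu):
                for v_bits in product("01", repeat=nv):
                    u, v = "".join(u_bits), "".join(v_bits)
                    for s in "01":
                        before = u + s + s + v
                        after = u + s + complement(s + v)
                        n = len(before)
                        if any(count(before, t) > count(after, t)
                               for t in range(n + 1)):
                            return False
    return True

print(check_flipping(3, 3))
```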

Corollary II.1. [Alternative proof for the maximality of $C_n$]
Given any string $X$ of length $n$, it can be transformed into
the string $C_n$ by a series of flipping operations (as defined in
Lemma II.6). Each such flip can only increase the number
of subsequences, and thus we get a proof for the fact that
$|D_t(X)| \le |D_t(C_n)|$.
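The transformation in Corollary II.1 can be sketched as a left-to-right scan: whenever two adjacent symbols are equal, the suffix starting at the second one is complemented. This particular scan order is our own choice for illustration; the corollary only requires that some series of flips exists. The sketch also checks, by brute force, that the alternating string attains the maximum over all strings of a given short length.

```python
from itertools import combinations, product

def count(x, t):
    """|D_t(x)| by brute force; 0 when t < 0 or t > |x|."""
    n = len(x)
    if t < 0 or t > n:
        return 0
    return len({"".join(kept) for kept in combinations(x, n - t)})

def complement(s):
    return s.translate(str.maketrans("01", "10"))

def flip_to_alternating(x):
    """Yield the strings visited while turning x into an alternating string."""
    yield x
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            # One flipping operation with U = x[:i-1], sigma = x[i-1], V = x[i+1:].
            x = x[:i] + complement(x[i:])
            yield x

def cyclic_is_maximal(n):
    """Check max over all length-n strings of |D_t(X)| equals |D_t(C_n)| for every t."""
    c = ("01" * n)[:n]
    strings = ["".join(bits) for bits in product("01", repeat=n)]
    return all(max(count(x, t) for x in strings) == count(c, t)
               for t in range(n + 1))

for step in flip_to_alternating("00110001"):
    print(step, [count(step, t) for t in range(1, 5)])
print(cyclic_is_maximal(6))
```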



4) Balancing operation [Figure 4(c)]: Informally, we refer
to a string as balanced if there is a low variability between
the lengths of the string's runs. A balancing operation is one
that decreases that variability, e.g. shortening a long run and
increasing the length of a short run. The following lemma
states terms in which balancing a string increases the number
of its subsequences, and it is used later to prove maximality
of string families.

Lemma II.7. [Balancing increases the number of subsequences]
For $X = S(x_1, \ldots, x_r)$, and for any $t \ge 0$, $1 \le
i < j \le r$ s.t. $x_i - x_j > 1$, and $\{x_{i+1}, \ldots, x_{j-1}\}$ is symmetric
(i.e. $x_{i+1} = x_{j-1}$, $x_{i+2} = x_{j-2}, \ldots$), $|D_t(x_1, \ldots, x_r)| \le
|D_t(x_1, \ldots, x_i - 1, \ldots, x_j + 1, x_{j+1}, \ldots, x_r)|$.

In other words, decreasing the $i$-th run by 1, and increasing the
$j$-th run by 1 can only increase the number of subsequences.

In order to prove Lemma II.7 we will need the following
lemma that characterizes balancing operations near the edges
of the string.

Lemma II.8. Assume $\{x_2, \ldots, x_{r-1}\}$ is symmetric. It follows
that:

(i) For $X = S(x_1, \ldots, x_r)$ s.t. $x_1 > x_r$, $|D_t(x_1, \ldots, x_r)| \le
|D_t(x_1 - 1, x_2, \ldots, x_{r-1}, x_r + 1)|$.

(ii) For $X = S(x_1, \ldots, x_r, z)$ s.t. $x_1 > x_r$ and $z >
0$, $|D_t(x_1, \ldots, x_r, z)| \le |D_t(x_1 - 1, x_2, \ldots, x_{r-1}, x_r +
1, z)|$.

(iii) For $X = S(y, x_1, \ldots, x_r)$ s.t. $x_1 - x_r > 1$ and $y > 0$,
$|D_t(y, x_1, \ldots, x_r)| \le |D_t(y, x_1 - 1, x_2, \ldots, x_{r-1}, x_r +
1)|$.

(iv) For $X = S(y, x_1, \ldots, x_r, z)$ s.t. $x_1 - x_r > 1$ and $y, z > 0$,
$|D_t(y, x_1, \ldots, x_r, z)| \le |D_t(y, x_1 - 1, x_2, \ldots, x_{r-1}, x_r +
1, z)|$.
Proof:

(i) When $r = 2$ the claim is reduced to $|D_t(x_1, x_2)| \le
|D_t(x_1 - 1, x_2 + 1)|$ for $x_1 > x_2$. This is easily
proved because $|D_t(x_1, x_2)| = \min(x_1, x_2, t, n - t) + 1$ with $n = x_1 + x_2$. For
$r > 2$ we use Lemma II.2 to get $|D_t(x_1, \ldots, x_r)| =
|D_t(x_1 - 1, x_2, \ldots, x_r)| + |D_{t-x_1}(x_2 - 1, x_3, \ldots, x_r)|$.
Using Lemma II.2 and the symmetry of $\{x_2, \ldots, x_{r-1}\}$
we get $|D_t(x_1 - 1, x_2, \ldots, x_{r-1}, x_r + 1)| = |D_t(x_r +
1, x_2, \ldots, x_{r-1}, x_1 - 1)| = |D_t(x_r, x_2, \ldots, x_{r-1}, x_1 -
1)| + |D_{t-x_r-1}(x_2 - 1, x_3, \ldots, x_{r-1}, x_1 - 1)|$. We compare
the two expressions. Because of the symmetry
$|D_t(x_1 - 1, x_2, \ldots, x_r)| = |D_t(x_r, x_2, \ldots, x_{r-1}, x_1 -
1)|$, and because $x_1 > x_r$ it is true that $S(x_2 -
1, x_3, \ldots, x_r) \in D_{x_1-x_r-1}(x_2 - 1, x_3, \ldots, x_{r-1}, x_1 - 1)$ and
thus using Lemma II.5, $|D_{t-x_1}(x_2 - 1, x_3, \ldots, x_r)| \le
|D_{t-x_r-1}(x_2 - 1, x_3, \ldots, x_{r-1}, x_1 - 1)|$.

(ii) Applying Lemma II.3(ii) we get $|D_t(x_1, \ldots, x_r, z)| =
|D_t(x_1, \ldots, x_r)| + \sum_{i=1}^{z} |D_{t-i}(x_1, \ldots, x_{r-1}, x_r - 1)| +
1_{t > n - z}$, and $|D_t(x_1 - 1, x_2, \ldots, x_{r-1}, x_r + 1, z)| =
|D_t(x_1 - 1, x_2, \ldots, x_{r-1}, x_r + 1)| + \sum_{i=1}^{z} |D_{t-i}(x_1 -
1, x_2, \ldots, x_r)| + 1_{t > n - z}$. The two expressions are
comparable argument by argument using (i) above, noticing
that if $x_1 > x_r$ then definitely $x_1 > x_r - 1$.

(iii) Applying Lemma II.3 we get $|D_t(y, x_1, \ldots, x_r)| =
|D_t(x_1, \ldots, x_r)| + \sum_{i=1}^{y} |D_{t-i}(x_1 - 1, x_2, \ldots, x_r)| +
1_{t > n - y}$, and $|D_t(y, x_1 - 1, x_2, \ldots, x_{r-1}, x_r + 1)| =
|D_t(x_1 - 1, x_2, \ldots, x_{r-1}, x_r + 1)| + \sum_{i=1}^{y} |D_{t-i}(x_1 -
2, x_2, \ldots, x_{r-1}, x_r + 1)| + 1_{t > n - y}$. The two expressions
are comparable argument by argument using (i) above
and noticing that if $x_1 - x_r > 1$ then definitely $x_1 > x_r$
and $x_1 - 1 > x_r$.

(iv) We use Lemma II.3 to get $|D_t(y, x_1, \ldots, x_r, z)| =
|D_t(x_1, \ldots, x_r, z)| + \sum_{i=1}^{y} |D_{t-i}(x_1 - 1, x_2, \ldots, x_r, z)| +
1_{t > n - y}$ and $|D_t(y, x_1 - 1, x_2, \ldots, x_{r-1}, x_r + 1, z)| =
|D_t(x_1 - 1, x_2, \ldots, x_{r-1}, x_r + 1, z)| + \sum_{i=1}^{y} |D_{t-i}(x_1 -
2, x_2, \ldots, x_{r-1}, x_r + 1, z)| + 1_{t > n - y}$. The two expressions
are comparable argument by argument using (ii)
above, as the condition $x_1 - x_r > 1$ guarantees that
$x_1 - 1 > x_r$.

■

Now we can prove Lemma II.7 [Balancing increases the
number of subsequences]:

Proof: We will prove by induction on the number of
runs in $X$ outside of the sequence $S(x_i, \ldots, x_j)$, explicitly on
$(i - 1) + (r - j) = r + i - j - 1$. We will denote these runs
outer runs. When we have only one outer run, the lemma is
reduced to Lemma II.8(ii) or II.8(iii). Now we assume that
there are at least two outer runs. If the outer runs are one on
each side ($i = 2$ and $j = r - 1$) this is the case of Lemma
II.8(iv). Otherwise, at least on one of the sides there are two
or more runs ($i > 2$ or $j < r - 1$). We assume w.l.o.g. that
$i > 2$, and then we can use Lemma II.3 and the induction
hypothesis on strings with the number of outer runs decreased
by 1.

■

III. Balanced strings 

In this section we define the family of strings named
balanced strings. We call a string balanced if all the runs
of symbols in the string are of equal length. Formally, we
denote by $B_{r,k}$ the binary string of length $rk$, with $r$ runs,
each of length $k$. E.g. $B_{3,4} = S(4,4,4) = 000011110000$.
We will prove that of all strings with length $rk$ and $r$ runs,
the balanced string has the maximal number of subsequences,
under any number of deletions.

Theorem III.1. Let $X = S(x_1, \ldots, x_r)$, $n = \sum_{i=1}^{r} x_i$ and $k =
n/r$. If $k$ is an integer, then $|D_t(X)| \le |D_t(B_{r,k})|$.

Proof: The main idea of the proof is that any such
string $X$ can be transformed into $B_{r,k}$ by repeatedly applying
the Balancing Lemma II.7. Each such step can only increase
the number of subsequences, so if such a series of balance
operations can be found, the theorem is proved. We will
construct a series of strings, $X_0, \ldots, X_m$, such that $X_0 = X$,
$X_m = B_{r,k}$ and for any $0 \le i < m$, $|D_t(X_i)| \le |D_t(X_{i+1})|$.
Given a string $X_i \ne B_{r,k}$, we denote $X_i = S(x_1^{(i)}, \ldots, x_r^{(i)})$.

We choose a pair $(p, q)$ s.t. $|x_p^{(i)} - x_q^{(i)}| > 1$, $p < q$ and
$q - p$ is minimal. Such a pair exists, because at least one



TABLE I
Example of a balancing process as defined in the proof of
Theorem III.1

i   X_i               runs        sum of x_j^2   |D_6(X_i)|
0   000111111100100   3,7,2,1,2   67             43
1   000111111000100   3,6,3,1,2   59             56
2   000111110000100   3,5,4,1,2   55             63
3   000111110001100   3,5,3,2,2   51             85
4   000111100001100   3,4,4,2,2   49             92
5   000111100011100   3,4,3,3,2   47             102
6   000111000111000   3,3,3,3,3   45             105

run is of length different from $k$ (w.l.o.g. bigger than $k$), and
thus there is at least one other run with length smaller than
$k$. Assume w.l.o.g. that $x_p^{(i)} > x_q^{(i)}$, and then we can conclude
that $x_p^{(i)} > x_{p+1}^{(i)} = x_{p+2}^{(i)} = \cdots = x_{q-1}^{(i)} > x_q^{(i)}$, otherwise we
get a contradiction to the minimality of $(p, q)$. We will define
$X_{i+1}$ to be the string obtained from $X_i$ by decreasing the $p$-th
run by 1, and increasing the $q$-th run by 1. Each pair of strings
$X_i, X_{i+1}$ satisfies the conditions of Lemma II.7 and thus
$|D_t(X_i)| \le |D_t(X_{i+1})|$. This process is finite, because the
value of $\sum_{j=1}^{r} (x_j^{(i)})^2$ is a non-negative integer that must decrease
at every step. An example of the balancing process we use is
displayed in Table I. ■
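The balancing process of this proof can be sketched in a few lines. The tie-breaking rule below (smallest gap $q - p$ first, then leftmost pair) is our own assumption; the proof allows any pair with minimal $q - p$, so the intermediate strings need not coincide with those of Table I. The sum of squared run lengths is printed as the potential that guarantees termination.

```python
from itertools import combinations

def count(x, t):
    """|D_t(x)| by brute force; 0 when t < 0 or t > |x|."""
    n = len(x)
    if t < 0 or t > n:
        return 0
    return len({"".join(kept) for kept in combinations(x, n - t)})

def S(runs):
    """Binary string with the given run lengths, starting with '0'."""
    return "".join(("0" if i % 2 == 0 else "1") * length
                   for i, length in enumerate(runs))

def balance_step(runs):
    """Move one symbol between the closest pair of runs whose lengths differ by > 1."""
    r = len(runs)
    for gap in range(1, r):                   # minimal q - p first
        for p in range(r - gap):
            q = p + gap
            if abs(runs[p] - runs[q]) > 1:
                longer, shorter = (p, q) if runs[p] > runs[q] else (q, p)
                new = list(runs)
                new[longer] -= 1
                new[shorter] += 1
                return new
    return None                               # no pair differs by more than 1

def balance(runs):
    """Return the whole sequence X_0, X_1, ..., X_m of run-length vectors."""
    path = [list(runs)]
    while (nxt := balance_step(path[-1])) is not None:
        path.append(nxt)
    return path

for runs in balance([3, 7, 2, 1, 2]):
    print(runs, sum(x * x for x in runs), count(S(runs), 6))
```

Each step lowers the potential by $2(x_p - x_q - 1) > 0$, which is why the loop must stop.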
We derive the following corollary for the case where $n$ is
not divisible by $r$.

Corollary III.1. Let $X = S(x_1, \ldots, x_r)$, $n = \sum_{i=1}^{r} x_i$, and
$k = n/r$. Then $|D_t(X)| \le |D_t(B_{r,\lceil k \rceil})|$.

Proof: For integral $k$ this is the case of Theorem
III.1. Otherwise, we denote $a = r\lceil k \rceil - n$, and let $Y =
S(x_1, \ldots, x_{r-1}, x_r + a)$. Using Lemma II.4, $|D_t(X)| \le
|D_t(Y)|$, and since $|Y| = r\lceil k \rceil$ and $r(Y) = r$, using Theorem
III.1 we get $|D_t(Y)| \le |D_t(B_{r,\lceil k \rceil})|$. ■

IV. Our Upper bound 

In this section we present an upper bound for the number 
of subsequences of a string obtained by deletions. We develop 
a recursive expression for the exact number of subsequences 
of a balanced string. We then find an explicit form for this 
expression, and use it to obtain a tight upper bound on the 
number of subsequences of a general string. 

A. Recursive expression 

Definition IV.1. For all $r, k$, let $B'_{r,k}$ be the string obtained from
$B_{r,k}$ by removing the first symbol. E.g. $B'_{3,5} = S(4, 5, 5) =
00001111100000$.

Definition IV.2. Let $b(r,k,t) = |D_t(B_{r,k})|$ and $b'(r,k,t) =
|D_t(B'_{r,k})|$.

Lemma IV.1. For all $r, k, t$: $|D_t(B_{r,k})| = |D_t(B'_{r,k})| +
|D_{t-k}(B'_{r-1,k})|$.

Proof: This is derived from Lemma II.2. ■

When $k$ is known from the context, we will use the short
notations $B_r$ and $B'_r$ for $B_{r,k}$ and $B'_{r,k}$ respectively. Likewise
$b(r,t)$ and $b'(r,t)$ denote $b(r,k,t)$ and $b'(r,k,t)$ respectively.



Lemma IV.2. [Recursive expression for $b'$]

$b'(r,t) = \begin{cases} 0 & \text{if } t < 0 \text{ or } t \ge kr \\ 1 + \sum_{i=1}^{k-1} b'(r-1, t-i) & \text{if } k(r-1) \le t < kr \\ b'(r-2, t-k) + \sum_{i=0}^{k-1} b'(r-1, t-i) & \text{otherwise} \end{cases}$

Proof: Using Lemma II.3 we get $b'(r,t) = b(r-1,t) +
\sum_{i=1}^{k-1} b'(r-1, t-i) + 1_{t > k(r-1)}$. We check the following
cases:

(i) $t < k(r-1)$: Using Lemma IV.1, $b(r-1,t) =
b'(r-1,t) + b'(r-2,t-k)$, and we get $b'(r,t) =
b'(r-2,t-k) + \sum_{i=0}^{k-1} b'(r-1,t-i)$.

(ii) $t = k(r-1)$: In this case $t = |B_{r-1}|$ and $b(r-1,t) = 1$.
We get $b'(r,t) = 1 + \sum_{i=1}^{k-1} b'(r-1,t-i)$.

(iii) $t > k(r-1)$: Here $t > |B_{r-1}|$ and $b(r-1,t) = 0$.
We get $b'(r,t) = \sum_{i=1}^{k-1} b'(r-1,t-i) + 1$.

Rearranging the cases we get the claim of the lemma. ■
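The recursion for $b'$ can be evaluated with memoisation and compared against brute force. The sketch below uses the three cases in the form: 0 if $t < 0$ or $t \ge kr$; $1 + \sum_{i=1}^{k-1} b'(r-1,t-i)$ if $k(r-1) \le t < kr$; and $b'(r-2,t-k) + \sum_{i=0}^{k-1} b'(r-1,t-i)$ otherwise. The non-strict boundaries are our reading of the case conditions, and the function names are hypothetical.

```python
from functools import lru_cache
from itertools import combinations

def count(x, t):
    """|D_t(x)| by brute force; 0 when t < 0 or t > |x|."""
    n = len(x)
    if t < 0 or t > n:
        return 0
    return len({"".join(kept) for kept in combinations(x, n - t)})

def balanced(r, k):
    """B_{r,k}: r runs, each of length k, starting with '0'."""
    return "".join(("0" if i % 2 == 0 else "1") * k for i in range(r))

@lru_cache(maxsize=None)
def b_prime(r, t, k):
    """b'(r,t) = |D_t(B'_{r,k})| via the three-case recursion."""
    if t < 0 or t >= k * r:
        return 0
    if t >= k * (r - 1):
        return 1 + sum(b_prime(r - 1, t - i, k) for i in range(1, k))
    return b_prime(r - 2, t - k, k) + sum(b_prime(r - 1, t - i, k) for i in range(k))

def b(r, t, k):
    """b(r,t) = b'(r,t) + b'(r-1,t-k), used here for t < rk."""
    return b_prime(r, t, k) + b_prime(r - 1, t - k, k)

def check(max_k, max_r):
    for k in range(1, max_k + 1):
        for r in range(1, max_r + 1):
            full = balanced(r, k)
            if any(b_prime(r, t, k) != count(full[1:], t) for t in range(r * k + 1)):
                return False
            if any(b(r, t, k) != count(full, t) for t in range(r * k)):
                return False
    return True

print(check(3, 4))
```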

B. Solving the recursion 

When calculating $b'(r,t)$ we expand the recursive expression
iteratively, until all $b'$ expressions reach their boundary
condition and get zero value. The only positive contribution
in this sum is from the 1 in the second case ($1 + \sum_{i=1}^{k-1} b'(r -
1, t - i)$). By counting how many times this value is added,
we can get the explicit value of $b'(r,t)$. The 1 values are
added exactly every time the second case is used, i.e. when
expanding the value of $b'(\tilde{r}, \tilde{t})$ for $\tilde{r}, \tilde{t}$ that fulfill the condition
$k(\tilde{r} - 1) \le \tilde{t} < k\tilde{r}$. When expanding $b'(r,t)$ these are exactly
the integral solutions for $\tilde{r} = \lfloor \tilde{t}/k \rfloor + 1$, $0 \le \tilde{t} \le t$, which are
simply the $t + 1$ pairs $(r_i, t_i) = (\lfloor i/k \rfloor + 1, i)$ for $0 \le i \le t$.
We will count the number of times that $b'(\tilde{r}, \tilde{t})$ appears in
the complete expansion of $b'(r,t)$. Based on the recursion
form in Lemma IV.2, the expression $b'(\tilde{r}, \tilde{t})$ can only appear
in the single expansion of one of the following expressions:
$b'(\tilde{r} + 2, \tilde{t} + k)$, or $b'(\tilde{r} + 1, \tilde{t} + i)$ when $0 \le i \le k - 1$.
Counting the number of those paths is equivalent to calculating
the number of possible sets of ordered tuples $\{(r_j, t_j)\}$ selected
from the set $\{(2,k), (1,0), (1,1), \ldots, (1,k-1)\}$ s.t. $\sum r_j =
r - \tilde{r}$ and $\sum t_j = t - \tilde{t}$.

Definition IV.3. We denote as $S_k$ the set
$\{(2,k), (1,0), (1,1), \ldots, (1,k-1)\}$, and as $\#P(\Delta r, \Delta t)$
the number of possible sets of ordered tuples $\{(r_j, t_j)\}$ selected
from the set $S_k$ s.t. $\sum r_j = \Delta r$ and $\sum t_j = \Delta t$. $\#P_j(\Delta r, \Delta t)$ will
denote the number of such sets using the tuple $(2,k)$ exactly $j$
times.



Lemma IV.3.

$\#P_0(\Delta r, \Delta t) = \sum_{i=0}^{\lfloor \Delta t/k \rfloor} (-1)^i \binom{\Delta r}{i} \left(\!\!\binom{\Delta r}{\Delta t - ik}\!\!\right)$

Proof: In the case of $\#P_0$, the problem is reduced to
finding the number of ordered partitions of $t$ into $r$ parts,
each of size no larger than $k - 1$. The following development
follows the technique used in [8] in the context of counting
deletion patterns; similar results are calculated in [10]. This
partitioning problem can be restated as counting the different
solutions $\{y_i\}$ to the equation $\sum_{i=1}^{r} y_i = t$ with $y_i < k$ for all $i$.

The number of solutions ignoring the constraints $y_i < k$
is equivalent to the number of $r$-partitions of $t$, which is
$\left(\!\!\binom{r}{t}\!\!\right) = \binom{r+t-1}{t}$. The number of solutions that violate the
constraint $y_1 < k$ is $\left(\!\!\binom{r}{t-k}\!\!\right)$. Subtracting this for each $y_i$, we
get $\left(\!\!\binom{r}{t}\!\!\right) - r\left(\!\!\binom{r}{t-k}\!\!\right)$. Now we subtracted too much, because
solutions that violate two constraints are subtracted twice. The
number of solutions that violate the two constraints $y_1 < k$ and
$y_2 < k$ is $\left(\!\!\binom{r}{t-2k}\!\!\right)$, and there are $\binom{r}{2}$ such pairs. Adding these
cases back to the count we get $\left(\!\!\binom{r}{t}\!\!\right) - r\left(\!\!\binom{r}{t-k}\!\!\right) + \binom{r}{2}\left(\!\!\binom{r}{t-2k}\!\!\right)$.
Now again we have to account for the solutions that violate 3
constraints, that were added too many times, and so on. Putting
it all together we get $\#P_0(r,t) = \sum_{i=0}^{\lfloor t/k \rfloor} (-1)^i \binom{r}{i} \left(\!\!\binom{r}{t-ik}\!\!\right)$. ■



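Lemma IV.4.

$\#P(\Delta r, \Delta t) = \sum_{j=0}^{\lfloor \Delta t/k \rfloor} \binom{\Delta r - j}{j} \#P_0(\Delta r - 2j, \Delta t - jk)$

Proof: First we calculate $\#P_j(\Delta r, \Delta t)$. If we first select
$j$ times the tuple $(2,k)$, we are left with $\#P_0(\Delta r - 2j, \Delta t - jk)$
ways to select the remaining tuples. We then have $\binom{\Delta r - j}{j}$
ways to insert the $(2,k)$ tuples inside the rest, and thus
$\#P_j(\Delta r, \Delta t) = \binom{\Delta r - j}{j} \#P_0(\Delta r - 2j, \Delta t - jk)$. Summing on
all possible $j$-s, $\#P(\Delta r, \Delta t) = \sum_{j=0}^{\lfloor \Delta t/k \rfloor} \#P_j(\Delta r, \Delta t)$ and the
lemma's claim follows. ■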

Lemma IV.5.

$b'(r,t) = \sum_{i=0}^{t} \#P(r - \lfloor i/k \rfloor - 1, t - i)$

Proof: As mentioned in the discussion above, when
expanding $b'(r,t)$, exactly $t + 1$ pairs $(\tilde{r}, \tilde{t})$ are reached that
fulfill the conditions $k(\tilde{r} - 1) \le \tilde{t} < k\tilde{r}$, $0 \le \tilde{t} \le t$ and
thus contribute to the sum. These are exactly the $t + 1$ pairs
$(r_i, t_i) = (\lfloor i/k \rfloor + 1, i)$ for $0 \le i \le t$, and each one of them is
reached $\#P(r - r_i, t - t_i)$ times. Summing all together we get
$b'(r,t) = \sum_{i=0}^{t} \#P(r - \lfloor i/k \rfloor - 1, t - i)$, which is equal to the
lemma's claim. ■

Corollary IV.1. The combined results of Lemmas IV.1, IV.3,
IV.4 and IV.5 give an explicit expression for $|D_t(B_{r,k})|$. We
restate the results here:

$b(r,t) = b'(r,t) + b'(r-1, t-k)$

$b'(r,t) = \sum_{i=0}^{t} \#P(r - \lfloor i/k \rfloor - 1, t - i)$

$\#P(\Delta r, \Delta t) = \sum_{j=0}^{\lfloor \Delta t/k \rfloor} \binom{\Delta r - j}{j} \#P_0(\Delta r - 2j, \Delta t - jk)$

$\#P_0(\Delta r, \Delta t) = \sum_{i=0}^{\lfloor \Delta t/k \rfloor} (-1)^i \binom{\Delta r}{i} \left(\!\!\binom{\Delta r}{\Delta t - ik}\!\!\right)$

Using balanced strings we have achieved upper bounds for
the number of subsequences of general strings. Our bound of
Corollary IV.1 (in comparison to previous bounds) is depicted
in Figure 3.
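The closed form can be evaluated directly and compared with brute-force enumeration on small balanced strings. In the sketch below the names `multichoose`, `p0`, `p`, `b_prime` and `b` are ours; `multichoose(r, t)` is the multiset coefficient $\binom{r+t-1}{t}$, with the boundary values (zero parts, negative totals) handled explicitly as an assumption about how the formulas are meant at the edges. The expression for $b$ is compared only for $t < rk$.

```python
from itertools import combinations
from math import comb

def count(x, t):
    """|D_t(x)| by brute force; 0 when t < 0 or t > |x|."""
    n = len(x)
    if t < 0 or t > n:
        return 0
    return len({"".join(kept) for kept in combinations(x, n - t)})

def multichoose(parts, total):
    """Ordered ways to write `total` as a sum of `parts` non-negative integers."""
    if parts < 0 or total < 0:
        return 0
    if parts == 0:
        return 1 if total == 0 else 0
    return comb(parts + total - 1, total)

def p0(dr, dt, k):
    """#P_0: dr steps of type (1, i) with 0 <= i <= k-1 whose increments sum to dt."""
    if dr < 0 or dt < 0:
        return 0
    return sum((-1) ** i * comb(dr, i) * multichoose(dr, dt - i * k)
               for i in range(dt // k + 1))

def p(dr, dt, k):
    """#P: also allow j steps of type (2, k), placed among the dr - j steps."""
    return sum(comb(dr - j, j) * p0(dr - 2 * j, dt - j * k, k)
               for j in range(dr // 2 + 1))

def b_prime(r, t, k):
    return sum(p(r - i // k - 1, t - i, k) for i in range(t + 1))

def b(r, t, k):
    return b_prime(r, t, k) + b_prime(r - 1, t - k, k)

def check(max_k, max_r):
    for k in range(1, max_k + 1):
        for r in range(1, max_r + 1):
            full = "".join(("0" if i % 2 == 0 else "1") * k for i in range(r))
            if any(b_prime(r, t, k) != count(full[1:], t) for t in range(r * k + 1)):
                return False
            if any(b(r, t, k) != count(full, t) for t in range(r * k)):
                return False
    return True

print(check(3, 4))
```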



V. Unbalanced strings 

In this section we define a second family of strings, named
unbalanced strings. We call a string unbalanced if all of
the runs of symbols in the string are of length 1, except
for one run. Let $U_{n,r}^{(j)}$ be a binary string of length $n$ with $r$
runs, in which all runs are of length 1, except for the $j$-th
run which is of length $n - r + 1$. We notice that due to
symmetry $|D_t(U_{n,r}^{(1)})| = |D_t(U_{n,r}^{(r)})|$, and define $u(n,r,t) =
|D_t(U_{n,r}^{(1)})|$. We will show that these extreme
cases have the least number of subsequences among the
unbalanced strings, and conclude that they have the least
number of subsequences among all strings.

Theorem V.1. [Unbalanced strings have the least subsequences]
For $X = S(x_1, \ldots, x_r)$, $n = \sum_{i=1}^{r} x_i$, and any
$1 \le t \le n$, $|D_t(X)| \ge u(n,r,t)$.

Proof: First we will prove that there exists $j$ s.t.
$|D_t(X)| \ge |D_t(U_{n,r}^{(j)})|$, for all $t$. We notice that the balancing
operation of Lemma II.7 can be used in the other direction,
as an unbalancing operation. We will transform the string
$X$ into a string $U_{n,r}^{(j)}$ by repeatedly applying the unbalancing
operation. Each such step can only decrease the number of
subsequences, so by constructing a series of such operations,
we will prove that $|D_t(X)| \ge |D_t(U_{n,r}^{(j)})|$. Let $j$ be the index of
a maximal run in $X$. We will construct a series of strings,
$X_0, \ldots, X_m$, such that $X_0 = X$, $X_m = U_{n,r}^{(j)}$ and for any
$0 \le i < m$, $|D_t(X_i)| \ge |D_t(X_{i+1})|$. For any $i < m$, we
denote $X_i = S(x_1^{(i)}, \ldots, x_r^{(i)})$. We choose an index $p \ne j$ s.t.
$x_p^{(i)} > 1$ and all runs between the $j$-th run and the $p$-th run are
all of length 1. Such an index exists, otherwise $X_i$ is already an
unbalanced string. We define $X_{i+1}$ to be the string obtained
from $X_i$ by increasing the $j$-th run by 1, and decreasing the
$p$-th run by 1. Since $x_j$ was the maximal run in $X$ and each
operation only made it bigger while all other runs could only
shorten, we have that $x_j^{(i)} \ge x_p^{(i)}$. The runs between the $j$-th run
and the $p$-th run are all of length 1, and so trivially symmetric,
and so the conditions of the reverse Lemma II.7 hold, and
$|D_t(X_i)| \ge |D_t(X_{i+1})|$.
To complete the proof we will prove that for any $j$,
$|D_t(U_{n,r}^{(j)})| \ge u(n,r,t)$. For $j = 1$, $u(n,r,t) = |D_t(U_{n,r}^{(1)})|$ by
definition. For $j \ge 2$ we will prove by induction on $j$. For $j =
2$, $|D_t(U_{n,r}^{(2)})| = |D_t(1, n-r+1, (r-2) \times 1)|$; using Lemma
II.2 we get $|D_t(U_{n,r}^{(2)})| = |D_t(n-r+1, (r-2) \times 1)| +
|D_{t-1}(n-r, (r-2) \times 1)|$. We compare this to $u(n,r,t) =
|D_t((r-1) \times 1, n-r+1)| = |D_t((r-2) \times 1, n-r+1)| +
|D_{t-1}((r-3) \times 1, n-r+1)|$. Using the flipping Lemma
II.6 on the second addend and symmetry on both, we get
$u(n,r,t) \le |D_t(n-r+1, (r-2) \times 1)| + |D_{t-1}(n-r, (r-2) \times 1)|
= |D_t(U_{n,r}^{(2)})|$. For the induction step, we assume that
the claim is true for $2, \ldots, j-1$ and prove it for $j$. For $j > 2$,
using Lemma II.2 we get $|D_t(U_{n,r}^{(j)})| = |D_t(U_{n-1,r-1}^{(j-1)})| +
|D_{t-1}(U_{n-2,r-2}^{(j-2)})|$. Using the induction assumption on both





TheoremIV.1I 




i 


x, 


runs 


D5(X,) 





0011100111100 


2,3,2,4,2 


60 


1 


0011101111100 


2,3,1,5,2 


38 


2 


0011101111110 


2,3,1,6,1 


26 


3 


0011011111110 


2,2,1,7,1 


20 


4 


0010111111110 


2,1,1,8,1 


14 


5 


0101111111110 


1,1,1,9,1 


10 




1111111110101 


9,1,1,1,1 


8 



addends, we get that \Dt{u\il) \ > \u{n - l,r -l,t) + u{n - 
2,r — 2,t) and using Lemma llOl again, the last sum is equal 
to u{n, r, f) and thus the induction step is proved. An example 
of the unbalancing process is displayed in Table II. ■ 
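The unbalancing process used in the proof can be replayed on short strings. The sketch below is only an illustration under our own assumptions: the helper names are hypothetical, $|D_t(X)|$ is counted by exhaustive enumeration, and ties between eligible runs $p$ are broken by picking the nearest one (leftmost on a tie), a choice the proof does not prescribe.

```python
from itertools import combinations

def string_from_runs(runs, first="0"):
    """Binary string whose i-th run has length runs[i]; symbols alternate."""
    other = "1" if first == "0" else "0"
    return "".join((first if i % 2 == 0 else other) * x for i, x in enumerate(runs))

def num_subsequences(x, t):
    """|D_t(x)| by exhaustive enumeration; only practical for short strings."""
    n = len(x)
    return len({"".join(x[i] for i in keep) for keep in combinations(range(n), n - t)})

def unbalance_step(runs, j):
    """Move one symbol into run j from an eligible run p, or return None."""
    eligible = [
        p for p in range(len(runs))
        if p != j and runs[p] > 1
        and all(runs[q] == 1 for q in range(min(p, j) + 1, max(p, j)))
    ]
    if not eligible:
        return None  # every run other than j already has length 1
    # Tie-breaking rule (nearest run, leftmost first) is our assumption.
    p = min(eligible, key=lambda q: (abs(q - j), q))
    nxt = list(runs)
    nxt[j] += 1
    nxt[p] -= 1
    return nxt

def unbalancing_sequence(runs):
    """Run-length vectors X_0, ..., X_m produced by repeated unbalancing."""
    j = max(range(len(runs)), key=lambda i: runs[i])  # index of a maximal run
    seq = [list(runs)]
    while (nxt := unbalance_step(seq[-1], j)) is not None:
        seq.append(nxt)
    return seq

if __name__ == "__main__":
    for i, runs in enumerate(unbalancing_sequence([2, 3, 2, 4, 2])):
        x = string_from_runs(runs)
        print(i, x, runs, num_subsequences(x, 5))
```

Started from the run lengths 2,3,2,4,2 with $t = 5$, this tie-breaking rule happens to visit the same strings as Table II.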

VI. Our lower bound 

In this section we develop a recursive expression for the
number of subsequences obtained from an unbalanced string by
deletions. We then find an explicit form for this expression and
use it to obtain a lower bound on the number of subsequences of a
general string. In addition, we show the improvement that our
lower bound provides.

A. Recursive expression 

Lemma VI.1. For all $0 < r \le n$, $0 < t < n$,

$$u(n,r,t) = \begin{cases}
r & \text{if } r = 1,2 \\
2 & \text{if } r > 1 \text{ and } t = n-1 \\
d(n,t) & \text{if } n = r \\
u(n-1,r,t) + d(r-2,\, t+r-n-1) & \text{otherwise}
\end{cases}$$

where $d(r,t) = |D_t(C_r)| = \sum_{i=0}^{t} \binom{r-t}{i}$, as proved
in [2]. We assume $d(n,0) = 1$, and for $t < 0$, $d(n,t) = 0$.

Proof:

• When $r = 1$, $U_{n,r}$ is a constant string, and has only one
possible subsequence (the constant string of length $n - t$).

• When $r = 2$, $U_{n,r}$ is of the form $\sigma\epsilon^{n-1}$, and has two
possible subsequences, namely $\sigma\epsilon^{n-t-1}$ and $\epsilon^{n-t}$.

• When $t = n - 1$ and $r > 1$ any subsequence is a single
symbol. Since $r > 1$, it can be either symbol of the binary
alphabet.

• When $n = r$, $U_{n,r} = C_n$, the binary cyclic string of
length $n$. $|D_t(C_n)| = d(n,t)$ by definition.

• In the other cases ($2 < r < n$, $t < n - 1$), we
regard $U^{(r)}_{n,r}$ ("tail first"). We apply Lemma III.2 and
get $|D_t(U^{(r)}_{n,r})| = |D_t(U^{(r)}_{n-1,r})| + |D_{t-(n-r+1)}(C_{r-2})| =
u(n-1,r,t) + d(r-2,\, t+r-n-1)$. ■
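The recursion of Lemma VI.1 translates directly into a memoised function. The following is a minimal sketch, assuming the lemma holds for $0 < r \le n$ and $0 < t < n$; the names `d`, `u` and `brute_u` are ours, and `d` returns 0 for $t > n$ purely as a guard (that case never arises inside the recursion). The brute-force counter is included only to cross-check small instances.

```python
from functools import lru_cache
from itertools import combinations
from math import comb

def d(n, t):
    """d(n, t) = sum_{i=0}^{t} C(n - t, i); zero outside 0 <= t <= n."""
    if t < 0 or t > n:
        return 0
    return sum(comb(n - t, i) for i in range(t + 1))

@lru_cache(maxsize=None)
def u(n, r, t):
    """u(n, r, t) via the four cases of Lemma VI.1 (0 < r <= n, 0 < t < n)."""
    if r in (1, 2):
        return r
    if t == n - 1:
        return 2
    if n == r:
        return d(n, t)
    return u(n - 1, r, t) + d(r - 2, t + r - n - 1)

def brute_u(n, r, t):
    """|D_t| of the unbalanced string with a first run of length n - r + 1."""
    x = "0" * (n - r + 1) + "".join("10"[i % 2] for i in range(r - 1))
    return len({"".join(x[i] for i in keep)
                for keep in combinations(range(n), n - t)})

if __name__ == "__main__":
    print(u(13, 5, 5), brute_u(13, 5, 5))
```

For the parameters of the last row of Table II ($n = 13$, $r = 5$, $t = 5$) both functions agree with the tabulated value.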



B. Solving the recursion 

Theorem VI.1. [Closed form formula for $u(n,r,t)$] For all
$t < n$, $2 \le r \le n$,

(i) when $r > t$:

$$u(n,r,t) = d(r,t) + \sum_{i=t+r-n-1}^{t-2} d(r-2,i).$$

(ii) when $r \le t$:

$$u(n,r,t) = 2 + \sum_{i=t+r-n-1}^{r-3} d(r-2,i).$$

Proof: We sequentially expand $u(n,r,t)$ using Lemma VI.1
until reaching one of the boundary conditions. After one such
expansion we get $u(n,r,t) = u(n-1,r,t) + d(r-2,\, t+r-n-1)$;
after $j$ expansions (assuming the boundary conditions weren't
reached) we get

$$u(n,r,t) = u(n-j,r,t) + \sum_{i=t+r-n-1}^{t+r-n+j-2} d(r-2,i).$$

We notice that $i$ can be negative, and in these cases $d(r-2,i)$
is defined to be zero. When $r > t$, after $n - r$ steps we get
$u(n,r,t) = u(r,r,t) + \sum_{i=t+r-n-1}^{t-2} d(r-2,i)$, and as
$u(r,r,t) = d(r,t)$ we get (i) above. When $r \le t$, after
$n - t - 1$ steps we get
$u(n,r,t) = u(t+1,r,t) + \sum_{i=t+r-n-1}^{r-3} d(r-2,i) =
2 + \sum_{i=t+r-n-1}^{r-3} d(r-2,i)$ and we get (ii) above. ■
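The closed form can be checked numerically against exhaustive enumeration. This is a sketch under our own reading of the theorem: case (i) is taken for $r > t$ and case (ii) for $r \le t$ (the two expressions coincide at $r = t$), and the function names are hypothetical.

```python
from itertools import combinations
from math import comb

def d(n, t):
    """d(n, t) = sum_{i=0}^{t} C(n - t, i); zero outside 0 <= t <= n."""
    if t < 0 or t > n:
        return 0
    return sum(comb(n - t, i) for i in range(t + 1))

def u_closed(n, r, t):
    """Closed form of Theorem VI.1 for 2 <= r <= n and t < n."""
    lo = t + r - n - 1                      # may be negative; d is 0 there
    if r > t:
        return d(r, t) + sum(d(r - 2, i) for i in range(lo, t - 1))
    return 2 + sum(d(r - 2, i) for i in range(lo, r - 2))

def brute_u(n, r, t):
    """|D_t| of the unbalanced string with a first run of length n - r + 1."""
    x = "0" * (n - r + 1) + "".join("10"[i % 2] for i in range(r - 1))
    return len({"".join(x[i] for i in keep)
                for keep in combinations(range(n), n - t)})

if __name__ == "__main__":
    for n, r, t in [(13, 5, 5), (10, 4, 3), (9, 6, 2)]:
        print((n, r, t), u_closed(n, r, t), brute_u(n, r, t))
```

Note that `range(lo, t - 1)` covers $i = t+r-n-1, \ldots, t-2$ and `range(lo, r - 2)` covers $i = t+r-n-1, \ldots, r-3$, matching the summation limits.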
We notice that when the number of deletions is no greater
than $n - r + 1$ the expression for $u(n,r,t)$ does not depend on
$n$, as stated in the following corollary:

Corollary VI.1. For $2 \le r \le n$ and $t \le n - r + 1$:

(i) when $r > t$:

$$u(n,r,t) = d(r,t) + \sum_{i=0}^{t-2} d(r-2,i).$$

(ii) when $r \le t$:

$$u(n,r,t) = 2 + \sum_{i=0}^{r-3} d(r-2,i).$$
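As a worked instance, the last row of Table II ($n = 13$, $r = 5$, $t = 5$) satisfies $t \le n - r + 1 = 9$, so the expressions of the corollary apply. Since $r = t$ here, the two expressions coincide; both are evaluated below using only the definition of $d$ from Lemma VI.1.

```latex
\begin{align*}
d(3,0) &= 1, \quad
d(3,1) = \tbinom{2}{0} + \tbinom{2}{1} = 3, \quad
d(3,2) = \tbinom{1}{0} + \tbinom{1}{1} + \tbinom{1}{2} = 2, \quad
d(3,3) = 1, \quad d(5,5) = 1, \\
d(5,5) + \sum_{i=0}^{3} d(3,i) &= 1 + (1 + 3 + 2 + 1) = 8, \\
2 + \sum_{i=0}^{2} d(3,i) &= 2 + (1 + 3 + 2) = 8.
\end{align*}
```

Both evaluations give 8, the value listed in Table II for the string 1111111110101.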



C. Improving known lower bounds on number of subsequences 

The results of Theorem V.1 together with Theorem VI.1
lead to the following:

Theorem VI.2. [Lower bound on the number of subsequences]
For all $t < n$, $2 \le r \le n$ and any $r$-run string $X$,

$$|D_t(X)| \ge d(r,t) + \sum_{i=t+r-n-1}^{\min(t-2,\,r-3)} d(r-2,i).$$

We compare this result to the previous result by Hirschberg
et al. [2], $|D_t(X)| \ge d(r,t) = \sum_{i=0}^{t} \binom{r-t}{i}$. We limit
the comparison to $t \le r$, as for $t > r$ the previous bound
gives 0.
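To get a feel for the size of the improvement, the two lower bounds can be evaluated side by side and compared with the true count on short strings. This is an illustrative sketch only: the function names are ours, `d` is taken to be 0 when $t$ exceeds its first argument (matching the remark that the previous bound gives 0 for $t > r$), and the exhaustive check is limited to very short strings.

```python
from itertools import combinations, product
from math import comb

def d(n, t):
    """Bound of [2]: sum_{i=0}^{t} C(n - t, i); zero outside 0 <= t <= n."""
    if t < 0 or t > n:
        return 0
    return sum(comb(n - t, i) for i in range(t + 1))

def lower_bound(n, r, t):
    """Right-hand side of Theorem VI.2."""
    lo, hi = t + r - n - 1, min(t - 2, r - 3)
    return d(r, t) + sum(d(r - 2, i) for i in range(lo, hi + 1))

def num_runs(x):
    return 1 + sum(a != b for a, b in zip(x, x[1:]))

def num_subsequences(x, t):
    n = len(x)
    return len({"".join(x[i] for i in keep)
                for keep in combinations(range(n), n - t)})

def check_all(max_n):
    """True if d(r,t) <= lower_bound <= |D_t(X)| for every short string X."""
    for n in range(2, max_n + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            r = num_runs(x)
            if r < 2:
                continue
            for t in range(n):
                lb = lower_bound(n, r, t)
                if not d(r, t) <= lb <= num_subsequences(x, t):
                    return False
    return True

if __name__ == "__main__":
    for t in range(1, 8):
        print(t, d(5, t), lower_bound(13, 5, t))
    print(check_all(7))
```

The printed table uses $n = 13$ and $r = 5$, the parameters of Table II, so the two bounds can be read off for each $t$.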

Lemma VI.2. Let a = t/r.fora e [i ' ^ 



u(n,r,t) 
d(n,i) 



n 



fV02'-(«-3) 

y ra 



and for t < 



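Proof: $d(r,t) \le (t+1) \max_{0 \le i \le t} \binom{r-t}{i}$. The series
$\binom{r-t}{i}$ reaches its maximum at $i = \lfloor (r-t)/2 \rfloor$. This
value is reached, because $t \ge r/3$ implies that
$t \ge (r-t)/2$. Thus
$d(r,t) \le (t+1)\binom{r-t}{\lfloor (r-t)/2 \rfloor}$. Stirling's
approximation implies that
$\binom{a}{\lfloor a/2 \rfloor} = \Theta(2^a/\sqrt{a})$, and thus we get
$d(r,t) = O\left(\frac{t}{\sqrt{r-t}}\, 2^{r-t}\right)$.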



On the other hand as t 

-2n ^ rll{r-5)\ 



2 > L- 



r-2 



2,L¥J) > ( 



L3(^-3)J 



) = 0( 



2, 
23' 



u{r,t) 

u{n,r,t) 
d(n,t) 



2 



J, u{r,t) > d{r - 
) = 0(^), thus 



23 ^ 



Jin!) thus 

tVT- d(n,t) 



For large enough strings ($n > t + r$), the improvement that
the bound in Theorem VI.2 gives over the result in [2] depends
on the ratio between $r$ and $t$. We depict our improved results
in Figure 2.

VII. Deletion patterns 

Consider a string $X$. Deletion of $t$ letters from $X$ can
be characterized by partitioning $t$ into the number of letters
deleted from each run, leading to the following definition of
deletion patterns.

Definition VII.1. Let $X$ be a string s.t. $X = S(x_1, \ldots, x_r)$.
A deletion pattern of size $t$ is a set of integers $\{y_1, \ldots, y_r\}$
fulfilling $\sum_{i=1}^{r} y_i = t$ and, for all $1 \le i \le r$,
$y_i \in [0, x_i]$. Each $y_i$ represents the number of letters
deleted from the $i$-th run of $X$. E.g. the deletion pattern
$\{2,1,2\}$ for the string 000110000 results in the subsequence
0100. Let $\mathcal{P}_t(X)$ denote the set of deletion patterns of size
$t$ for the string $X$.

It is important to notice that applying different deletion
patterns to a string can result in the same subsequence.
E.g. for the string 11011, the deletion patterns $\{1,1,0\}$ and
$\{0,1,1\}$ both result in the subsequence 111. The following
lemma ties deletion patterns to the study of subsequences
(and appears partially in [8]).
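The two examples above can be reproduced mechanically. The sketch below is a minimal illustration; `run_list` and `apply_pattern` are names we introduce here, and a pattern is passed as a list ordered by run index.

```python
from itertools import groupby

def run_list(x):
    """Runs of x as (symbol, length) pairs, in order."""
    return [(ch, len(list(group))) for ch, group in groupby(x)]

def apply_pattern(x, pattern):
    """Delete pattern[i] letters from the i-th run of x."""
    runs = run_list(x)
    if len(pattern) != len(runs):
        raise ValueError("pattern needs one entry per run")
    kept = []
    for (ch, length), y in zip(runs, pattern):
        if not 0 <= y <= length:
            raise ValueError("cannot delete more letters than the run holds")
        kept.append(ch * (length - y))
    return "".join(kept)

if __name__ == "__main__":
    print(apply_pattern("000110000", [2, 1, 2]))
    print(apply_pattern("11011", [1, 1, 0]), apply_pattern("11011", [0, 1, 1]))
```

Because a run may be deleted entirely, neighbouring runs of the same symbol can merge in the output, which is how two different patterns end up producing the same subsequence.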

Lemma VII.1. For any $X = S(x_1, \ldots, x_r)$, let $X'$ denote
the string $S(x_1 - 1, \ldots, x_r - 1)$. Informally $X'$ is the string
obtained by deleting one letter from each run in $X$. It follows
that $|\mathcal{P}_t(X')| \le |D_t(X)| \le |\mathcal{P}_t(X)|$.

Proof: Deleting letters from a given string according
to a deletion pattern is a deterministic process, and so each
deletion pattern yields exactly one subsequence; thus the
right inequality follows. As mentioned before, several deletion
patterns can yield the same subsequence, but this redundancy
doesn't exist for deletion patterns that preserve the number
of runs in a string, i.e. patterns for which there is no run in
which all the symbols are deleted. In this case it is possible to
reconstruct the deletion pattern from the subsequence in a unique
way, and there is a one-to-one correspondence between the
deletion patterns and the subsequences. The set of deletion
patterns of $X$ that preserve the number of runs is exactly the
set of deletion patterns in which at least one symbol is not
deleted from each run, and its size is equal to $|\mathcal{P}_t(X')|$.
This set has a one-to-one correspondence with the subset of
$D_t(X)$ of strings with exactly $r$ runs, and thus the left
inequality holds. ■
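The sandwich in Lemma VII.1 can be observed directly on short strings. The following sketch enumerates deletion patterns and subsequences exhaustively; the helper names are ours, and patterns for $X'$ are counted on its run-length vector (which may contain zeros) rather than on a string.

```python
from itertools import combinations, groupby, product

def run_lengths(x):
    return [len(list(group)) for _, group in groupby(x)]

def num_patterns(lengths, t):
    """Number of vectors y with 0 <= y_i <= lengths[i] and sum(y) == t."""
    return sum(1 for y in product(*(range(x + 1) for x in lengths))
               if sum(y) == t)

def num_subsequences(x, t):
    n = len(x)
    return len({"".join(x[i] for i in keep)
                for keep in combinations(range(n), n - t)})

def sandwich(x, t):
    """(|P_t(X')|, |D_t(X)|, |P_t(X)|) for a binary string x."""
    lengths = run_lengths(x)
    reduced = [length - 1 for length in lengths]  # run lengths of X'
    return (num_patterns(reduced, t), num_subsequences(x, t),
            num_patterns(lengths, t))

if __name__ == "__main__":
    print(sandwich("11011", 2))
    print(sandwich("000110000", 5))
```

For 11011 with $t = 2$, the middle value is strictly smaller than the right one because patterns such as $\{1,1,0\}$ and $\{0,1,1\}$ collapse to the same subsequence.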



Lemma VII.2. For any $X$, $|\mathcal{P}_t(X)| = |\mathcal{P}_{|X|-t}(X)|$.

Proof: Let $X = S(x_1, \ldots, x_r)$ and let $\{y_1, \ldots, y_r\}$ be
a $t$-deletion pattern. It follows that $\sum_{i=1}^{r} y_i = t$. We
define $y'_i = x_i - y_i$ for all $1 \le i \le r$. As $y_i \in [0, x_i]$
it follows that $y'_i \in [0, x_i]$ and
$\sum_{i=1}^{r} y'_i = \sum_{i=1}^{r} (x_i - y_i) = |X| - t$, and
so $\{y'_i\}$ is a $(|X| - t)$-deletion pattern of $X$. Each
$t$-deletion pattern can be mapped to a $(|X| - t)$-deletion
pattern, and this mapping is reversible, thus
$|\mathcal{P}_t(X)| = |\mathcal{P}_{|X|-t}(X)|$. ■
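The complement map $y_i \mapsto x_i - y_i$ from the proof is easy to exercise in code. A small sketch, with helper names of our own choosing and the run lengths 3,2,4 of the string 000110000 as the example:

```python
from itertools import product

def patterns(lengths, t):
    """All deletion patterns of size t for the given run lengths."""
    return [y for y in product(*(range(x + 1) for x in lengths))
            if sum(y) == t]

def complement(lengths, y):
    """Map a t-deletion pattern to the (|X| - t)-deletion pattern x_i - y_i."""
    return tuple(x - yi for x, yi in zip(lengths, y))

if __name__ == "__main__":
    lengths = [3, 2, 4]
    total = sum(lengths)
    for t in range(total + 1):
        print(t, len(patterns(lengths, t)), len(patterns(lengths, total - t)))
```

Applying `complement` twice returns the original pattern, which is the reversibility used in the proof.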

A. The number of deletion patterns for balanced strings 

We use the result obtained in Lemma IV.3 and restate it for
deletion patterns to get the following result:

Lemma Vn.3. \P,{B,^,)\ = j:\i\-irQ = 
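For experimentation, $|\mathcal{P}_t(B_{r,k})|$ can be computed without relying on any particular formula, since it is simply the number of integer vectors $(y_1, \ldots, y_r)$ with $0 \le y_i \le k$ that sum to $t$. The sketch below counts them by dynamic programming and, as a cross-check, by the textbook inclusion-exclusion identity for bounded compositions. That identity is a standard combinatorial fact that we bring in as an assumption; it is not claimed to be the exact form in which the lemma is stated, and the function names are ours.

```python
from math import comb

def count_patterns_dp(r, k, t):
    """Number of (y_1..y_r), 0 <= y_i <= k, summing to t (dynamic programming)."""
    ways = [1] + [0] * t              # ways[s]: patterns of size s over 0 runs
    for _ in range(r):                # add one run of length k at a time
        ways = [sum(ways[s - y] for y in range(min(k, s) + 1))
                for s in range(t + 1)]
    return ways[t]

def count_patterns_ie(r, k, t):
    """Same count via inclusion-exclusion over runs that exceed k deletions."""
    total = 0
    for i in range(min(r, t // (k + 1)) + 1):
        total += (-1) ** i * comb(r, i) * comb(t - i * (k + 1) + r - 1, r - 1)
    return total

if __name__ == "__main__":
    r, k = 4, 3
    for t in range(r * k + 1):
        print(t, count_patterns_dp(r, k, t), count_patterns_ie(r, k, t))
```

The dynamic program costs $O(r \cdot t \cdot k)$ additions, which is ample for the small parameters one would use to explore the bounds discussed next.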

We now study the multiplicative gap between $|\mathcal{P}_t(B_{r,k})|$
and the previous bounds of [1], [2], [4] for values of $t$
close to $n/2$ and sufficiently large $r$, $k$. This is an intriguing
setting for $t$ in the context of deletion channels [3]. It follows
from basic observations (and also directly from the proof of
Lemma IV.3) that

$$|\mathcal{P}_t(B_{r,k})| \le \min\left(\binom{r+t-1}{t},\, (k+1)^r\right).$$

The first bound above is exactly that of [4], while the second
bound follows from the fact that each $y_i$ in a deletion pattern
is an integer between 0 and $k$ (notice that the former bound
does not depend on the parameter $k$ while the latter does not
depend on $t$). In what follows, we show that the bound of
$(k+1)^r$ improves on the bounds in [4] and [1], [2] for values
of $r$ and $k$ which are sufficiently large.
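The three upper bounds under discussion are cheap to tabulate, which makes it easy to see for which $r$ and $k$ the $(k+1)^r$ bound takes over. The sketch below is illustrative only; the function and key names are ours, and the exact pattern count is obtained by a simple dynamic program rather than by any formula from the text.

```python
from math import comb

def count_patterns(r, k, t):
    """|P_t(B_{r,k})|: vectors (y_1..y_r), 0 <= y_i <= k, summing to t."""
    ways = [1] + [0] * t
    for _ in range(r):
        ways = [sum(ways[s - y] for y in range(min(k, s) + 1))
                for s in range(t + 1)]
    return ways[t]

def upper_bounds(r, k, t):
    """The bound of [4], the per-run bound (k+1)^r, and the bound of [2]."""
    n = r * k
    return {
        "levenshtein": comb(r + t - 1, t),
        "per_run": (k + 1) ** r,
        "hirschberg_regnier": sum(comb(n - t, i) for i in range(t + 1)),
    }

if __name__ == "__main__":
    for r, k in [(4, 4), (10, 8), (40, 8)]:
        t = r * k // 2                 # t = n/2, the setting studied here
        print(r, k, upper_bounds(r, k, t))
```

At $t = n/2$ the entry for [2] evaluates to $2^{n/2}$, so comparing it with $(k+1)^r$ amounts to comparing $2^{k/2}$ with $k + 1$ per run.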

For $t = n/2 = kr/2$, the bound of $\sum_{i=0}^{t} \binom{n-t}{i}$ from [2] is
exactly $2^{n/2}$. The bound of $\binom{r+t-1}{t}$ from [4] is at least

Here we use the fact that 

/r(l + a)\ ^ _J_ f (r(l + ^))'-(i+") \ ^ (1 + a)'' (1 + 

derived from Stirling's formula; and the fact that for positive 
X, (1 + 1 1 xy^^ > For c = {k+i){iii\k) ' '■^^ above impUes 
that our bound of {k + ly on |ff(B,-j:)l is superior to that 
given in [4] (and that in [2]) by a multiplicative factor of at 
least 

12k^ 

Notice that for large $k$, $c > (1 + \delta)$ for a constant $\delta > 0$.
We conclude that a multiplicative gap of at least that specified
above also holds between $|D_t(B_{r,k})|$ and the bounds in [1],
[2], [4].

For sufficiently small e > and f = — e), a similar 

analysis will give a gap of ~ for c = (,,+//(i+Y/(^/2-eic)) - 
Here also, for small $\epsilon$ and large $k$, $c > (1 + \delta)$ for a
constant $\delta > 0$. All in all, for values of $t$ which are close
to $n/2$ and for sufficiently large $r$ and $k$, we get that
$|\mathcal{P}_t(B_{r,k})|$, and thus our bound on $|D_t(B_{r,k})|$, improves
on the bounds of [1], [2], [4] by an exponential multiplicative
factor of $2^{\Omega(r)}$.

VIII. Concluding remarks 

In this work we present several operations on binary
strings which are monotone with respect to the number of
subsequences under deletion. We show, using the operations
studied, that the balanced $r$-run string $B_{r,k}$ and the unbalanced
one $U_{n,r}$ obtain the maximum and, respectively, minimum
number of subsequences under deletion. By devising recursive
expressions, we present a precise analysis of the number of
subsequences of both $B_{r,k}$ and $U_{n,r}$ under $t$ deletions. For
our lower bound, we quantify our expressions asymptotically.
For our upper bound, we analyze deletion patterns to express
our asymptotic improvement over previous bounds. A direct
asymptotic analysis of our expression for $|D_t(B_{r,k})|$ is left
open in this work and is subject to future research.

References 

[1] L. Calabi and W. E. Hartnett. Some general results of coding theory with
applications to the study of codes for the correction of synchronization
errors. Information and Control, 15(3):235-249, 1969.

[2] D. S. Hirschberg and M. Regnier. Tight bounds on the number of string
subsequences. Journal of Discrete Algorithms, 1(1):123-132, 2000.

[3] I. A. Kash, M. Mitzenmacher, J. Thaler, and J. Ullman. On the zero-error
capacity threshold for deletion channels. CoRR, abs/1102.0040, 2011.

[4] V. I. Levenshtein. Binary codes capable of correcting deletions,
insertions, and reversals. Soviet Physics Doklady, 10(8):707-710, 1966.

[5] V. I. Levenshtein. Efficient reconstruction of sequences from their
subsequences or supersequences. Journal of Combinatorial Theory,
Series A, 93(2):310-332, 2001.

[6] H. Mercier. Communication over Channels with Symbol Synchronization
Errors. PhD thesis, The University of British Columbia, 2008.

[7] H. Mercier, V. K. Bhargava, and V. Tarokh. A survey of error-correcting
codes for channels with symbol synchronization errors. IEEE
Communications Surveys and Tutorials, 12(1):87-96, 2010.

[8] H. Mercier, M. Khabbazian, and V. K. Bhargava. On the number of
subsequences when deleting symbols from a string. IEEE Transactions
on Information Theory, 54(7):3279-3285, 2008.

[9] M. Mitzenmacher. A survey of results for deletion channels and related
synchronization channels. Probability Surveys, 6:1-33, 2009.

[10] J. Ratsaby. Estimate of the number of restricted integer-partitions. Appl.
Anal. Discrete Math., 2:222-233, 2008.



